Abstract. Collapsibility deals with the conditions under which a conditional (on a covariate $W$) measure of association between two random variables $X$ and $Y$ equals the marginal measure of association, under the assumption of homogeneity over the covariate. In this paper, we discuss the average collapsibility of certain well-known measures of association, and also of a new measure of association. The concept of average collapsibility is more general than collapsibility, and requires that the conditional average of an association measure equals the corresponding marginal measure. Sufficient conditions for the average collapsibility of the measures under consideration are obtained. Some difficult, but interesting, counter-examples are constructed. Applications to linear, Poisson, logistic and negative binomial regression models are addressed. An extension to the case of a multivariate covariate $W$ is also discussed.

1. Introduction

The study of association between two random variables arises in several applications. Several measures, nonparametric in nature, have been proposed in the literature. Often, the random variables of interest, say $X$ and $Y$, may be associated because of their association with another variable $W$, called a covariate or a background variable. In this case, we need to investigate the conditional association measure between $X$ and $Y$ given $W$, and compare it with the marginal association measure between $X$ and $Y$. It is in general possible that the conditional association measure is positive, while the marginal association measure is negative. Such an effect reversal is called the Yule-Simpson paradox, attributed to Yule (1903) and Simpson (1951). When the Yule-Simpson paradox, or effect reversal, does not occur, and a conditional measure of association equals the marginal measure, we say that the measure is collapsible over the covariate $W$. Collapsibility is an important issue associated with data analysis, analysis of contingency tables, causal inference, regression analysis, epidemiological studies and the design of experiments; see, for example, Cox and Wermuth (2003), Ma et al. (2008), and Xie et al. (2008) for applications and discussions.



There have been several notions of collapsibility, namely, simple, strong and uniform collapsibility. These issues have been addressed in several different contexts such as the analysis of contingency tables, regression models and association measures; see for example Bishop (1971), Cox (2003), Cox and Wermuth (2003), Geng (1992), Ma et al. (2006), Vellaisamy and Vijay (2008), Wermuth (1987, 1989), Whittemore (1978), and Xie et al. (2008). Cox and Wermuth (2003) studied the concept of distribution dependence and discussed the conditions under which no effect reversal occurs. Xie et al. (2008) discussed the simple collapsibility and the uniform collapsibility of the following association measures:

(i) $\dfrac{\partial}{\partial x} E(Y \mid x)$ (expectation dependence)

(ii) $\dfrac{\partial^2}{\partial x\,\partial y} \log f(x, y)$ (mixed derivative of interaction)

(iii) $\dfrac{\partial}{\partial x} F(y \mid x)$ (distribution dependence).

They also discussed the stringency of the above measures for positive association, studied the conditions for no effect reversal (after marginalization over $W$) and obtained the necessary and sufficient conditions for uniform collapsibility of the mixed derivative of interaction, among other results. Recently, Vellaisamy (2011) introduced a new concept of average collapsibility and discussed it with respect to the distribution dependence and the quantile regression coefficients. It is shown there that average collapsibility is a general concept and coincides with collapsibility under the condition of homogeneity. In the same spirit, we discuss in this paper the average collapsibility of the expectation dependence and mixed derivative of interaction measures, which have relevance to linear and logistic regression models. Also, a new measure of association, namely,

(iv) $\dfrac{\partial}{\partial x} \log E(Y \mid x)$ (log expectation dependence)

is introduced and its average collapsibility conditions are investigated. This measure has a direct application to Poisson and negative binomial regression models. In the last section, some results are extended to the case of a multivariate covariate $W$.

2. The Average Collapsibility Results

Let $(Y, X, W)$ be a random vector, where our interest is mainly in the association between $Y$ and $X$, and $W$ is treated as a covariate. We assume for simplicity that $X$ and $W$ are continuous, unless stated otherwise. Note that $Y$ has a monotone (increasing) regression function of $X$ if $E(Y \mid X = x)$ is increasing in $x$ or, equivalently, the expectation dependence function (EDF) satisfies $\partial E(Y \mid x)/\partial x \ge 0$. We first discuss the average collapsibility results for the EDF and introduce the following definition.

Definition 1 The expectation dependence function (EDF) is average collapsible over $W$ if

$$E_{W \mid x}\!\left[\frac{\partial}{\partial x} E(Y \mid x, W)\right] = \frac{\partial}{\partial x} E(Y \mid x), \quad \text{for all } x. \tag{1}$$



The following result gives sufficient conditions for the average collapsibility of the EDF. In the sequel, $X \perp\!\!\!\perp Y$ and $X \perp\!\!\!\perp Y \mid W$ respectively denote the independence of $X$ and $Y$, and the conditional independence of $X$ and $Y$ given $W$. We assume henceforth that all the partial derivatives exist and are continuous, so that differentiation and integration can be interchanged.

Theorem 1 The EDF $\dfrac{\partial}{\partial x} E(Y \mid x, w)$ is average collapsible over $W$ if either

(i) $E(Y \mid x, w)$ is independent of $w$, or

(ii) $X \perp\!\!\!\perp W$

holds.

The condition that $E(Y \mid x, w)$ is independent of $w$ implies the homogeneity of the EDF, and in this case both uniform collapsibility (Part (a) of Theorem 3.4 of Xie et al. (2008)) and average collapsibility hold. However, when the EDF is not homogeneous over $w$, average collapsibility may still hold if $X \perp\!\!\!\perp W$. Observe also that the condition that $E(Y \mid x, w)$ is independent of $w$ is weaker than $Y \perp\!\!\!\perp W \mid X$, which is usually required for other notions of collapsibility. For example, when $W > 0$ and $(Y \mid x, w) \sim U(x - w, x + w)$, we have $E(Y \mid x, w) = x$ for all $w$. But

$$F(y \mid x, w) = \frac{y - x + w}{2w}, \quad x - w < y < x + w, \tag{2}$$

showing that $Y$ and $W$ are not conditionally independent given $X$.

Some examples for Theorem 1 are the following. Suppose $(W \mid X = x) \sim N(x, 1)$ and $(Y \mid X = x, W = w) \sim N(x, w)$. As another example, let $X > 0$, $(W \mid X = x) \sim G(x, 1)$ and $(Y \mid X = x, W = w) \sim G(w, wx)$, where $G(\alpha, p)$ denotes the gamma distribution with mean $p/\alpha$. In both cases, $E(Y \mid x, w) = x$ is independent of $w$, and so the average collapsibility of the EDF $\partial E(Y \mid x, w)/\partial x$ holds.

We next show that condition (i) or (ii) is only sufficient, but not necessary. Hereafter, $\phi(z)$ and $\Phi(z)$ denote respectively the density and the distribution function of $Z \sim N(0, 1)$.

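Example 1 Suppose $(Y \mid X = x, W = w)$ follows the uniform distribution $U\bigl(0,\, x^2 + (w - x)^2\bigr)$ so that

$$F(y \mid x, w) = y\,\bigl(x^2 + (w - x)^2\bigr)^{-1}, \quad 0 < y < x^2 + (w - x)^2, \tag{3}$$

and $E(Y \mid x, w) = \tfrac{1}{2}\bigl(x^2 + (w - x)^2\bigr)$. Assume also $(W \mid X = x) \sim N(x, 1)$ so that

$$\frac{\partial}{\partial x} f(w \mid x) = -\phi'(w - x) = (w - x)\,\phi(w - x). \tag{4}$$

Then

$$\begin{aligned}
\int E(Y \mid x, w)\,\frac{\partial}{\partial x} f(w \mid x)\,dw
&= \frac{1}{2}\int_{-\infty}^{\infty} \bigl(x^2 + (w - x)^2\bigr)(w - x)\,\phi(w - x)\,dw \\
&= \frac{1}{2}\left[x^2\int_{-\infty}^{\infty} (w - x)\,\phi(w - x)\,dw + \int_{-\infty}^{\infty} (w - x)^3\,\phi(w - x)\,dw\right] \\
&= \frac{1}{2}\left[x^2\int_{-\infty}^{\infty} t\,\phi(t)\,dt + \int_{-\infty}^{\infty} t^3\,\phi(t)\,dt\right] \\
&= 0, \quad \text{for all } x.
\end{aligned} \tag{5}$$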

Thus, from (A.2), average collapsibility over $W$ holds, but neither condition (i) nor condition (ii) is satisfied.
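The claim of Example 1 can also be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a plain trapezoidal rule on a window of ten standard deviations around $x$, and all function names (`integrate`, `averaged_edf`, `marginal_edf`, `cross_term`) are hypothetical helpers introduced only for this illustration. It compares the conditional average of the EDF with a finite-difference derivative of the marginal mean, and evaluates the integral in (5).

```python
import math


def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def integrate(g, lo, hi, n=4000):
    """Composite trapezoidal rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h


def cond_mean(x, w):
    """E(Y | x, w) for Y | x, w ~ U(0, x^2 + (w - x)^2)."""
    return 0.5 * (x * x + (w - x) ** 2)


def cond_edf(x, w):
    """d/dx E(Y | x, w), which simplifies to 2x - w."""
    return 2.0 * x - w


def averaged_edf(x):
    """E_{W|x}[d/dx E(Y | x, W)] with W | x ~ N(x, 1)."""
    return integrate(lambda w: cond_edf(x, w) * phi(w - x), x - 10.0, x + 10.0)


def marginal_mean(x):
    """E(Y | x), integrating E(Y | x, w) against the N(x, 1) density."""
    return integrate(lambda w: cond_mean(x, w) * phi(w - x), x - 10.0, x + 10.0)


def marginal_edf(x, eps=1e-4):
    """Central finite difference approximation of d/dx E(Y | x)."""
    return (marginal_mean(x + eps) - marginal_mean(x - eps)) / (2.0 * eps)


def cross_term(x):
    """The integral in (5): int E(Y | x, w) * (w - x) * phi(w - x) dw."""
    return integrate(lambda w: cond_mean(x, w) * (w - x) * phi(w - x),
                     x - 10.0, x + 10.0)


if __name__ == "__main__":
    for x in (-1.0, 0.5, 2.0):
        print(x, averaged_edf(x), marginal_edf(x), cross_term(x))
```

Under these assumptions both sides of (1) should agree with $x$ up to numerical error, since $E(Y \mid x) = (x^2 + 1)/2$ here, and the integral in (5) should be numerically zero.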

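We next discuss an implication of Theorem 1 for linear regression models.

Linear regression. Consider the following conditional and marginal linear regression models, respectively:

$$E(Y \mid X = x, W = w) = \begin{cases} \alpha(w) + \beta(w)\,x, & \text{if } W \text{ is discrete} \\ \alpha + \beta x + \gamma w, & \text{if } W \text{ is continuous} \end{cases}$$

and

$$E(Y \mid x) = \tilde{\alpha} + \tilde{\beta} x.$$

Then

$$\frac{\partial}{\partial x} E(Y \mid X = x, W = w) = \begin{cases} \beta(w), & \text{if } W \text{ is discrete} \\ \beta, & \text{if } W \text{ is continuous} \end{cases}$$

and

$$\frac{\partial}{\partial x} E(Y \mid x) = \tilde{\beta}.$$

We say that the regression coefficient $\beta(w)$ (or $\beta$) is simply collapsible if $\beta(w) = \tilde{\beta}$ for all $w$ (or $\beta = \tilde{\beta}$). Also, it is said to be average collapsible if

$$E_{W \mid x}\bigl(\beta(W)\bigr) = \tilde{\beta} \quad \bigl(\text{or } E_{W \mid x}(\beta) = \tilde{\beta}\bigr), \quad \text{for all } x. \tag{6}$$

Thus, the average collapsibility of the EDF reduces to the average collapsibility of regression coefficients, in the case of linear regression models.

The average collapsibility of regression coefficients $\beta(w)$ under the condition $E_W\bigl(\beta(W)\bigr) = \tilde{\beta}$ has been discussed by Vellaisamy and Vijay (2007). However, the definition of average collapsibility given in (6) is more natural as it involves the joint distribution of $W$ and $X$. Note also that $E_{W \mid x}\bigl(\beta(W)\bigr) = \tilde{\beta}$ for all $x$ implies $E_W\bigl(\beta(W)\bigr) = \tilde{\beta}$, but not necessarily conversely.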
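To illustrate why the joint distribution of $W$ and $X$ matters for average collapsibility, the sketch below uses a binary covariate with made-up coefficients (the values of `alpha`, `beta` and the two choices of $P(W = 1 \mid x)$ are assumptions chosen only for illustration, not taken from the paper). It compares the conditional average of $\beta(W)$ with the slope of the marginal regression function $E(Y \mid x) = \sum_w \bigl(\alpha(w) + \beta(w)x\bigr) P(W = w \mid x)$, once with $W$ independent of $X$ and once with a logistic dependence on $x$.

```python
import math


def marginal_mean(x, alpha, beta, p1):
    """E(Y | x) for binary W, where p1(x) = P(W = 1 | x)."""
    q = p1(x)
    return (1.0 - q) * (alpha[0] + beta[0] * x) + q * (alpha[1] + beta[1] * x)


def averaged_slope(x, beta, p1):
    """E_{W|x}(beta(W)): the conditional average of the slopes."""
    q = p1(x)
    return (1.0 - q) * beta[0] + q * beta[1]


def marginal_slope(x, alpha, beta, p1, eps=1e-5):
    """Central finite difference approximation of d/dx E(Y | x)."""
    up = marginal_mean(x + eps, alpha, beta, p1)
    down = marginal_mean(x - eps, alpha, beta, p1)
    return (up - down) / (2.0 * eps)


# Hypothetical coefficients, chosen only for illustration.
ALPHA = (1.0, 2.0)
BETA = (0.5, 1.5)


def independent_p1(x):
    """W independent of X: P(W = 1 | x) does not depend on x."""
    return 0.3


def dependent_p1(x):
    """W depends on X through a logistic link (an assumed form)."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    for name, p1 in (("independent", independent_p1), ("dependent", dependent_p1)):
        print(name, averaged_slope(0.0, BETA, p1),
              marginal_slope(0.0, ALPHA, BETA, p1))
```

In the independent case the two slopes coincide, in line with condition (ii) of Theorem 1; in the dependent case the marginal slope picks up the extra term $\sum_w \bigl(\alpha(w) + \beta(w)x\bigr)\,\partial P(W = w \mid x)/\partial x$ and the two generally differ.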

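Next, we look at the average collapsibility of the mixed derivative of interaction (MDI). Since

$$\frac{\partial^2}{\partial x\,\partial y} \log f(x, y) = \frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x), \quad \text{for all } x \text{ and } y, \tag{7}$$

it follows from Proposition 3.2.1 of Whittaker (1990) that

$$\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x) = 0 \ \text{ for all } x \text{ and } y \iff Y \perp\!\!\!\perp X.$$

In view of (7), the MDI henceforth stands for $\partial^2 \log f(y \mid x)/\partial x\,\partial y$, which motivates the following definition of average collapsibility.

Definition 2 The MDI is said to be average collapsible over $W$ if

$$E_{W \mid x}\!\left[\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, W)\right] = \frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x), \quad \text{for all } (y, x).$$

It is assumed that $\log f(y \mid x)$ has continuous partial derivatives so that

$$\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x) = \frac{\partial^2}{\partial y\,\partial x} \log f(y \mid x), \quad \text{for all } (y, x).$$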

The following result provides a set of sufficient conditions for the average collapsibility of the MDI.

Theorem 2 The MDI is average collapsible over $W$ if either

(i) $Y \perp\!\!\!\perp W \mid X$, or

(ii) $X \perp\!\!\!\perp W \mid Y$

holds.

Xie et al. (2008) showed that condition (i) or (ii) in Theorem 2 is necessary and sufficient for uniform collapsibility. The following counter-example shows that these conditions are only sufficient, but not necessary, for average collapsibility.

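Example 2 Let $X > 0$ and $(W \mid x) \sim N(x, 1)$. Assume that

$$f(y \mid x, w) = x\,y^{x-1}\bigl(x^2 + (w - x)^2\bigr), \quad 0 < y < \bigl(x^2 + (w - x)^2\bigr)^{-1/x}, \tag{8}$$

which can easily be seen to be a valid density. Then

$$\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, w) = \frac{1}{y} = E_{W \mid x}\!\left(\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, W)\right). \tag{9}$$

Since $(W \mid x) \sim N(x, 1)$, it follows that the marginal density of $(Y \mid x)$ is

$$\begin{aligned}
f(y \mid x) &= \int_{-\infty}^{\infty} f(y \mid x, w)\,f(w \mid x)\,dw \\
&= x\,y^{x-1}\left[x^2\int_{-\infty}^{\infty} \phi(w - x)\,dw + \int_{-\infty}^{\infty} (w - x)^2\,\phi(w - x)\,dw\right] \\
&= x\,y^{x-1}(x^2 + 1),
\end{aligned}$$

which is also a valid density on $0 < y < (x^2 + 1)^{-1/x}$. Also, it follows from (9) that

$$\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x) = \frac{1}{y} = E_{W \mid x}\!\left(\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, W)\right).$$

Thus, average collapsibility holds, though the condition (i) is not satisfied.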

It was quite challenging to construct Example 2, as it requires the interchange of log and integration, in addition to the other conditions. Observe also that in Example 2,

$$\frac{\partial}{\partial y} \log f(y \mid x, w) = \frac{\partial}{\partial y} \log f(y \mid x), \quad \text{for all } (y, x),$$

which leads to the average collapsibility. This observation leads to the following result, which generalizes Theorem 2 and whose proof is immediate.

Theorem 3 The MDI is average collapsible over $W$ if either

(i) $\dfrac{\partial}{\partial y} \log f(y \mid x, w) = \dfrac{\partial}{\partial y} \log f(y \mid x)$, for all $(y, x)$, or

(ii) $\dfrac{\partial}{\partial x} \log f(y \mid x, w) = \dfrac{\partial}{\partial x} \log f(y \mid x)$, for all $(y, x)$

holds.

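As additional examples for Theorem 3, let $f(y \mid x, w)$ be as in Example 2 and consider, for $\lambda > 0$, the tempered normal density

$$t_\lambda(w \mid x) = C_\lambda(x)\,e^{-\lambda w}\,\phi(w - x), \quad \text{for } x > 0,\ w \in \mathbb{R},$$

where

$$C_\lambda(x) = \left(\int_{-\infty}^{\infty} e^{-\lambda w}\,\phi(w - x)\,dw\right)^{-1} = e^{\left(x^2 - (x - \lambda)^2\right)/2}.$$

That is, $t_\lambda(w \mid x) = \phi(w - x + \lambda)$. Then the corresponding marginal density of $(Y \mid x)$ is

$$\begin{aligned}
f_\lambda(y \mid x) &= x\,y^{x-1}\left[x^2\int_{-\infty}^{\infty} \phi(w - x + \lambda)\,dw + \int_{-\infty}^{\infty} (w - x)^2\,\phi(w - x + \lambda)\,dw\right] \\
&= x\,y^{x-1}(x^2 + \lambda^2 + 1),
\end{aligned}$$

which is also a valid density on $0 < y < (x^2 + \lambda^2 + 1)^{-1/x}$. Thus, the average collapsibility of the MDI holds for the family $\{t_\lambda(w \mid x)\}$, $\lambda > 0$, also.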
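The two facts about the tempered normal density used above, namely that it reduces to a unit-variance normal density centred at $x - \lambda$ and that the second moment of $W$ about $x$ equals $1 + \lambda^2$, can be checked numerically. The sketch below is an illustration under stated assumptions: the exponential factor is taken as $e^{-\lambda w}$, the integration is a simple trapezoidal rule on a finite window, and the helper names are hypothetical.

```python
import math


def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def integrate(g, lo, hi, n=4000):
    """Composite trapezoidal rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, n):
        total += g(lo + i * h)
    return total * h


def norm_const(x, lam):
    """Closed form C_lambda(x) = exp((x^2 - (x - lam)^2) / 2)."""
    return math.exp(0.5 * (x * x - (x - lam) ** 2))


def tempered_density(w, x, lam):
    """t_lambda(w | x) = C_lambda(x) * exp(-lam * w) * phi(w - x)."""
    return norm_const(x, lam) * math.exp(-lam * w) * phi(w - x)


def total_mass(x, lam):
    """Integral of t_lambda(. | x) over a window centred at x - lam."""
    c = x - lam
    return integrate(lambda w: tempered_density(w, x, lam), c - 10.0, c + 10.0)


def second_moment_about_x(x, lam):
    """Integral of (w - x)^2 * t_lambda(w | x) dw."""
    c = x - lam
    return integrate(lambda w: (w - x) ** 2 * tempered_density(w, x, lam),
                     c - 10.0, c + 10.0)


if __name__ == "__main__":
    x, lam = 1.5, 0.7  # arbitrary illustrative values
    print(total_mass(x, lam), second_moment_about_x(x, lam), 1.0 + lam ** 2)
```

The second moment is what produces the factor $x^2 + \lambda^2 + 1$ in $f_\lambda(y \mid x)$.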

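Next, we discuss the connection of Theorem 2 to logistic regression models.

Logistic regression. Let $Y$ be binary and consider the following conditional and marginal logistic regression models considered in the literature (Vellaisamy and Vijay (2007), Xie et al. (2008)):

$$\log\!\left(\frac{f(1 \mid x, w)}{f(0 \mid x, w)}\right) = \begin{cases} \alpha(w) + \beta(w)\,x, & \text{if } W \text{ is discrete} \\ \alpha + \beta x + \gamma w, & \text{if } W \text{ is continuous} \end{cases}$$

and

$$\log\!\left(\frac{f(1 \mid x)}{f(0 \mid x)}\right) = \tilde{\alpha} + \tilde{\beta} x.$$

We say the logistic regression coefficient is simply collapsible if

$$\tilde{\beta} = \begin{cases} \beta(w) \text{ for all } w, & \text{if } W \text{ is discrete} \\ \beta, & \text{if } W \text{ is continuous.} \end{cases}$$

Also, $\beta(w)$ or $\beta$ is said to be average collapsible if $E_{W \mid x}\bigl(\beta(W)\bigr) = \tilde{\beta}$, when $W$ is discrete, and $E_{W \mid x}(\beta) = \tilde{\beta}$, when $W$ is continuous.

Since $Y$ is binary, the partial derivative is replaced by the difference between the adjacent levels of $Y$ (see Cox (2003)) so that

$$\begin{aligned}
\frac{\partial}{\partial x}\!\left(\frac{\partial}{\partial y} \log f(y \mid x, w)\right)
&= \frac{\partial}{\partial x}\bigl(\log f(1 \mid x, w) - \log f(0 \mid x, w)\bigr) \\
&= \frac{\partial}{\partial x} \log\!\left(\frac{f(1 \mid x, w)}{f(0 \mid x, w)}\right) \\
&= \begin{cases} \dfrac{\partial}{\partial x}\bigl(\alpha(w) + \beta(w)\,x\bigr) = \beta(w), & \text{if } W \text{ is discrete} \\[2ex] \dfrac{\partial}{\partial x}\bigl(\alpha + \beta x + \gamma w\bigr) = \beta, & \text{if } W \text{ is continuous,} \end{cases}
\end{aligned}$$

the logistic regression coefficients corresponding to both the cases of $W$.

From Theorem 2, we now conclude that $\beta(w)$ or $\beta$ is average collapsible if (i) $Y \perp\!\!\!\perp W \mid X$ or (ii) $X \perp\!\!\!\perp W \mid Y$ holds.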

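Finally, we discuss a new measure called log-expectation dependence (LED) between $X$ and $Y > 0$, defined by $\partial \log E(Y \mid x, w)/\partial x$, where it is assumed that $0 < E(Y \mid x) < \infty$, for all $x$. First note that for all $x$,

$$\begin{aligned}
\frac{\partial}{\partial x} \log E(Y \mid x) = 0
&\iff \frac{\partial}{\partial x} E(Y \mid x) = 0 \\
&\iff \int y\,\frac{\partial}{\partial x}\bigl(dF(y \mid x)\bigr) = 0 \\
&\Longleftarrow\ dF(y \mid x) = dF(y \mid x') \ \text{ for all } y,\ x \text{ and } x' \\
&\iff Y \perp\!\!\!\perp X.
\end{aligned}$$

Also, by Theorem 1 of Xie et al. (2008),

$$\frac{\partial}{\partial x} \log E(Y \mid x) \ge 0 \iff \frac{\partial}{\partial x} E(Y \mid x) \ge 0 \implies \rho(Y, X) \ge 0,$$

where $\rho(Y, X)$ is the correlation coefficient between $Y$ and $X$.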

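Next, we discuss the collapsibility issues for the LED measure, and hence the following definition.

Definition 3 The LED is simple collapsible if

$$\frac{\partial}{\partial x} \log E(Y \mid x, w) = \frac{\partial}{\partial x} \log E(Y \mid x), \quad \text{for all } x \text{ and } w, \tag{10}$$

and average collapsible if

$$E_{W \mid x}\!\left[\frac{\partial}{\partial x} \log E(Y \mid x, W)\right] = \frac{\partial}{\partial x} \log E(Y \mid x), \quad \text{for all } x. \tag{11}$$

Theorem 4 The LED is simple collapsible, and hence average collapsible, if $E(Y \mid x, w)$ does not depend on $w$.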

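We next discuss the relevance of the LED in the context of Poisson and negative binomial (NB) regression models.

Poisson regression. Consider the Poisson regression model defined by

$$(Y \mid X = x, W = w) \sim \mathrm{Poi}\bigl(\lambda(x, w)\bigr),$$

where the mean

$$E(Y \mid x, w) = \lambda(x, w) = \begin{cases} e^{\alpha(w) + \beta(w)x}, & \text{if } W \text{ is discrete} \\ e^{\alpha + \beta x + \gamma w}, & \text{if } W \text{ is continuous.} \end{cases}$$

Then

$$\frac{\partial}{\partial x}\bigl(\log E(Y \mid x, w)\bigr) = \begin{cases} \beta(w), & \text{if } W \text{ is discrete} \\ \beta, & \text{if } W \text{ is continuous.} \end{cases}$$

Let $(Y \mid x) \sim \mathrm{Poi}\bigl(e^{\tilde{\alpha} + \tilde{\beta} x}\bigr)$, the marginal Poisson regression model, so that

$$\log E(Y \mid x) = \tilde{\alpha} + \tilde{\beta} x; \qquad \frac{\partial}{\partial x} \log E(Y \mid x) = \tilde{\beta}.$$

Then, by Theorem 4, the average collapsibility of the Poisson regression coefficient $\beta(w)$ (or $\beta$) holds, that is,

$$E_{W \mid x}\bigl(\beta(W)\bigr) = \tilde{\beta} \quad \bigl(\text{or } E_{W \mid x}(\beta) = \tilde{\beta}\bigr)$$

is true, when $\lambda(x, w)$ does not depend on $w$, which in turn holds when, for example, $\gamma = 0$. Note that this does not in general mean that $Y \perp\!\!\!\perp W \mid X$.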

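The following interesting example shows that average collapsibility may hold, even when $E(Y \mid x, w)$ depends on $w$.

Example 3 Let $X > 0$ and $(Y \mid x, w) \sim \mathrm{Poi}\bigl(\lambda(x)\,w\bigr)$, where $\lambda(x) = \exp(\alpha + \beta x)$. Then $E(Y \mid x, w) = \lambda(x)\,w$ and

$$\frac{\partial}{\partial x} \log E(Y \mid x, w) = \beta = E_{W \mid x}\!\left(\frac{\partial}{\partial x} \log E(Y \mid x, W)\right). \tag{12}$$

Let now $(W \mid x) \sim G(x, x)$, the gamma distribution with mean unity. Then it is known that

$$(Y \mid x) \sim NB\!\left(x,\ \frac{x}{x + \lambda(x)}\right),$$

the negative binomial (NB) distribution with

$$P(Y = y \mid x) = \frac{\Gamma(y + x)}{y!\,\Gamma(x)}\left(\frac{x}{x + \lambda(x)}\right)^{x}\left(\frac{\lambda(x)}{x + \lambda(x)}\right)^{y}, \quad y = 0, 1, \ldots$$

Hence,

$$E(Y \mid x) = \lambda(x); \qquad \frac{\partial}{\partial x} \log E(Y \mid x) = \beta. \tag{13}$$

Thus, from (12) and (13), the average collapsibility holds. Note that here the covariates $W$ and $X$ are not independent.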
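The Poisson-gamma mixture step in Example 3 can be verified numerically. The following sketch is only an illustration: the parameter values `alpha = 0.1`, `beta = 0.3` and `x = 2.0` are arbitrary assumptions, $G(x, x)$ is read as a gamma distribution with shape $x$ and rate $x$, and the integral over $w$ is truncated and approximated by a trapezoidal rule. It compares the mixture probabilities with the stated NB probabilities and checks that the derivative of $\log E(Y \mid x)$ is close to $\beta$.

```python
import math


def rate(x, alpha, beta):
    """lambda(x) = exp(alpha + beta * x)."""
    return math.exp(alpha + beta * x)


def gamma_density(w, shape, rate_):
    """Gamma density with the given shape and rate (mean = shape / rate)."""
    return rate_ ** shape * w ** (shape - 1.0) * math.exp(-rate_ * w) / math.gamma(shape)


def poisson_pmf(y, mu):
    """Poisson probability P(Y = y) with mean mu."""
    return math.exp(-mu) * mu ** y / math.factorial(y)


def mixture_pmf(y, x, alpha, beta, upper=40.0, n=40000):
    """P(Y = y | x) by integrating Poi(lambda(x) w) against G(x, x) on [0, upper]."""
    lam = rate(x, alpha, beta)
    h = upper / n

    def g(w):
        return poisson_pmf(y, lam * w) * gamma_density(w, x, x)

    total = 0.5 * (g(0.0) + g(upper))
    for i in range(1, n):
        total += g(i * h)
    return total * h


def nb_pmf(y, k, lam):
    """NB(k, k / (k + lam)) probability, computed on the log scale."""
    log_p = (math.lgamma(y + k) - math.lgamma(y + 1.0) - math.lgamma(k)
             + k * math.log(k / (k + lam)) + y * math.log(lam / (k + lam)))
    return math.exp(log_p)


def nb_mean(x, alpha, beta, terms=400):
    """E(Y | x) under the NB marginal, by direct (truncated) summation."""
    lam = rate(x, alpha, beta)
    return sum(y * nb_pmf(y, x, lam) for y in range(terms))


def marginal_led(x, alpha, beta, eps=1e-4):
    """Central finite difference approximation of d/dx log E(Y | x)."""
    up = math.log(nb_mean(x + eps, alpha, beta))
    down = math.log(nb_mean(x - eps, alpha, beta))
    return (up - down) / (2.0 * eps)


if __name__ == "__main__":
    a, b, x = 0.1, 0.3, 2.0  # arbitrary illustrative values
    for y in range(4):
        print(y, mixture_pmf(y, x, a, b), nb_pmf(y, x, rate(x, a, b)))
    print(marginal_led(x, a, b), b)
```

The shape parameter is kept at or above one in this sketch so that the gamma density stays bounded at $w = 0$, which keeps the simple quadrature adequate.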

Negative binomial regression. Suppose in Example 3 we assume in addition that the unobservable $W$ is independent of $X$ and $W \sim G(\theta, \theta)$. Then again

$$(Y \mid x) \sim NB\!\left(\theta,\ \frac{\theta}{\theta + \lambda(x)}\right); \qquad E(Y \mid x) = \lambda(x). \tag{14}$$

The model (14) is the usual NB regression model. Thus, the average collapsibility of the LED function corresponds to that of the NB regression coefficient $\beta$. It is interesting to note that when the unobserved covariate $W$ follows the gamma distribution with mean unity, the average collapsibility of the NB regression coefficient holds, even when $W$ and $X$ are not independent (Example 3). Note, however, that in negative binomial regression,

$$\mathrm{Var}(Y \mid x) = \lambda(x)\left(1 + \frac{\lambda(x)}{\theta}\right) > \lambda(x) = E(Y \mid x), \tag{15}$$

unlike the Poisson regression case. Thus, whenever the data exhibit overdispersion (variance exceeds mean), the negative binomial regression model is commonly used.
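The variance formula (15) can be checked directly from the NB probabilities in (14). The sketch below is a small illustration with arbitrary assumed values $\theta = 3$ and $\lambda(x) = 2$; the function names are hypothetical and the infinite sums are truncated.

```python
import math


def nb_pmf(y, theta, lam):
    """NB(theta, theta / (theta + lam)) probability, computed on the log scale."""
    log_p = (math.lgamma(y + theta) - math.lgamma(y + 1.0) - math.lgamma(theta)
             + theta * math.log(theta / (theta + lam))
             + y * math.log(lam / (theta + lam)))
    return math.exp(log_p)


def nb_mean_and_variance(theta, lam, terms=2000):
    """Mean and variance of the NB distribution by truncated summation."""
    mean = 0.0
    second = 0.0
    for y in range(terms):
        p = nb_pmf(y, theta, lam)
        mean += y * p
        second += y * y * p
    return mean, second - mean * mean


if __name__ == "__main__":
    theta, lam = 3.0, 2.0  # arbitrary illustrative values
    mean, var = nb_mean_and_variance(theta, lam)
    print(mean, var, lam * (1.0 + lam / theta))
```

With these values the computed variance should match $\lambda(1 + \lambda/\theta)$ and exceed the mean, which is the overdispersion property noted above.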

3. The Multivariate Case

In this section, we consider an extension to the multivariate case. The case of a multivariate response $Y$ may be considered by treating one component at a time (Cox and Wermuth (2003) and Xie et al. (2008)), and similarly the covariate $X$ may also be considered one component at a time, while keeping the other components fixed. Therefore, we consider here only the case of a multivariate random vector $W = (W_1, \ldots, W_p)$. A conditional measure of association, say $\frac{\partial}{\partial x} E(Y \mid x, w)$, is simple collapsible over $W$ if

$$\frac{\partial}{\partial x}\bigl(E(Y \mid x, w)\bigr) = \frac{\partial}{\partial x}\bigl(E(Y \mid x)\bigr), \quad \text{for all } x \text{ and } w = (w_1, \ldots, w_p),$$

and average collapsible if

$$E_{W \mid x}\!\left[\frac{\partial}{\partial x}\bigl(E(Y \mid x, W)\bigr)\right] = \frac{\partial}{\partial x}\bigl(E(Y \mid x)\bigr), \quad \text{for all } x.$$

The definition of average collapsibility of other measures of association remains the same, except that $W$ is now a $p$-variate random vector.

Let $W = (W_1, W_2)$, where $W_1$ has $q$ components and $W_2$ has $(p - q)$ components. We now have the following result for the EDF and MDI; the corresponding results for the LED follow easily when $E(Y \mid x, w)$ is homogeneous over $w$.

Theorem 5 Let $W_1 \perp\!\!\!\perp W_2 \mid X$. Then the following results hold:

(a) The EDF is average collapsible over $W$ if (i) $Y \perp\!\!\!\perp W_1 \mid (X, W_2)$ and (ii) $X \perp\!\!\!\perp W_2$ hold.

(b) The MDI is average collapsible over $W$ if (i) $Y \perp\!\!\!\perp W_1 \mid X$ and (ii) $X \perp\!\!\!\perp W_2 \mid Y$ hold.

By symmetry, the average collapsibility of the MDI holds when $X$ and $Y$ are interchanged in conditions (i) and (ii) of Part (b) of Theorem 5. Also, Xie et al. (2008) established the uniform collapsibility of DDF and EDF under an additional condition of homogeneity of these measures. Thus, average collapsibility holds under less restrictive conditions and hence is applicable to a larger class of conditional distributions that may arise in practical applications.

APPENDIX



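Proof of Theorem 1 Note that

$$\begin{aligned}
\frac{\partial}{\partial x} E(Y \mid x)
&= \frac{\partial}{\partial x}\int E(Y \mid x, w)\,f(w \mid x)\,dw \\
&= \int \left[\frac{\partial}{\partial x} E(Y \mid x, w)\right] f(w \mid x)\,dw + \int E(Y \mid x, w)\,\frac{\partial}{\partial x} f(w \mid x)\,dw \\
&= E_{W \mid x}\!\left[\frac{\partial}{\partial x} E(Y \mid x, W)\right] + \int E(Y \mid x, w)\,\frac{\partial}{\partial x} f(w \mid x)\,dw.
\end{aligned} \tag{A.1}$$

Hence, average collapsibility holds if and only if

$$\int E(Y \mid x, w)\,\frac{\partial}{\partial x} f(w \mid x)\,dw = 0, \quad \text{for all } x. \tag{A.2}$$

Assume now condition (i) holds so that

$$E(Y \mid x, w) = h(x), \quad \text{for all } x \text{ and } w \text{ (say)}.$$

Then

$$\int E(Y \mid x, w)\,\frac{\partial}{\partial x} f(w \mid x)\,dw = h(x)\int \frac{\partial}{\partial x} f(w \mid x)\,dw = 0, \quad \text{for all } x,$$

since $\int f(w \mid x)\,dw = 1$ for all $x$. Hence, average collapsibility holds.

Assume next condition (ii) holds. Then $f(w \mid x) = f(w)$ does not depend on $x$, and so

$$\int E(Y \mid x, w)\,\frac{\partial}{\partial x} f(w \mid x)\,dw = 0, \quad \text{for all } x,$$

and average collapsibility holds again.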
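Proof of Theorem 2 Since

$$\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x) = \frac{\partial}{\partial x}\!\left(\frac{\frac{\partial}{\partial y} f(y \mid x)}{f(y \mid x)}\right), \tag{A.3}$$

average collapsibility of the MDI holds if and only if

$$E_{W \mid x}\!\left[\frac{\partial}{\partial x}\!\left(\frac{\frac{\partial}{\partial y} f(y \mid x, W)}{f(y \mid x, W)}\right)\right] = \frac{\partial}{\partial x}\!\left(\frac{\frac{\partial}{\partial y} f(y \mid x)}{f(y \mid x)}\right), \quad \text{for all } (y, x). \tag{A.4}$$

Note that condition (i) implies

$$f(y \mid x, w) = f(y \mid x) \implies \frac{\partial}{\partial y} f(y \mid x, w) = \frac{\partial}{\partial y} f(y \mid x), \quad \text{for all } (y, x, w).$$

Thus, equation (A.4) holds. Observe also that

$$\frac{\partial^2}{\partial x\,\partial y} \log f(x, y) = \frac{\partial}{\partial y}\!\left(\frac{\frac{\partial}{\partial x} f(x \mid y)}{f(x \mid y)}\right),$$

which is the same as equation (A.3) with $x$ and $y$ interchanged. Thus, the condition (ii) also implies the average collapsibility of the MDI.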
Proof of Theorem 4 Let

$$E(Y \mid x, w) = h_1(x), \quad \text{for all } x \text{ and } w. \tag{A.5}$$

Then

$$E(Y \mid x) = E_{W \mid x}\bigl(E(Y \mid x, W)\bigr) = E_{W \mid x}\bigl(h_1(x)\bigr) = h_1(x). \tag{A.6}$$

Thus, from (A.5) and (A.6),

$$E(Y \mid x) = E(Y \mid x, w), \quad \text{for all } x \text{ and } w,$$

and hence simple collapsibility holds. Also, since

$$\frac{\partial}{\partial x} \log E(Y \mid x, w) = \frac{\partial}{\partial x} \log E(Y \mid x),$$

the average collapsibility also holds.
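Proof of Theorem 5 (a) Observe that

$$\begin{aligned}
E_{W \mid x}\!\left[\frac{\partial}{\partial x} E(Y \mid x, W)\right]
&= \int\!\!\int \left[\frac{\partial}{\partial x} E(Y \mid x, w)\right] dF(w_1, w_2 \mid x) \\
&= \int \left\{\int \frac{\partial}{\partial x} E(Y \mid x, w_1, w_2)\,dF(w_1 \mid x)\right\} dF(w_2 \mid x), \quad (\text{since } W_1 \perp\!\!\!\perp W_2 \mid X) \\
&= \int \left\{\int \frac{\partial}{\partial x} E(Y \mid x, w_2)\,dF(w_1 \mid x)\right\} dF(w_2 \mid x), \quad (\text{since } Y \perp\!\!\!\perp W_1 \mid (X, W_2)) \\
&= \int \frac{\partial}{\partial x} E(Y \mid x, w_2)\,dF(w_2 \mid x) \\
&= E_{W_2 \mid x}\!\left(\frac{\partial}{\partial x} E(Y \mid x, W_2)\right) \\
&= \frac{\partial}{\partial x} E(Y \mid x), \quad \text{for all } x,
\end{aligned}$$

by condition (ii) of (a) and Theorem 1.

(b) First observe that

$$\begin{aligned}
\frac{\partial^2}{\partial x\,\partial y} \log f(x, y \mid w)
&= \frac{\partial^2}{\partial x\,\partial y} \log f(x, y, w_1, w_2) \\
&= \frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, w_1, w_2) \\
&= \frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, w_2),
\end{aligned} \tag{A.7}$$

since $Y \perp\!\!\!\perp W_1 \mid X$. By the assumption that $W_1 \perp\!\!\!\perp W_2 \mid X$ and (A.7),

$$\begin{aligned}
E_{W \mid x}\!\left(\frac{\partial^2}{\partial x\,\partial y} \log f(x, y \mid W)\right)
&= \int \left\{\int \frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, w_2)\,dF(w_1 \mid x)\right\} dF(w_2 \mid x) \\
&= \int \frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, w_2)\,dF(w_2 \mid x) \\
&= E_{W_2 \mid x}\!\left(\frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x, W_2)\right) \\
&= \frac{\partial^2}{\partial x\,\partial y} \log f(y \mid x), \quad \text{for all } x \text{ and } y,
\end{aligned}$$

by condition (ii) of Theorem 2.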